Define a function max() that takes two numbers as arguments and returns the larger of them.
In [2]:
# note: this definition shadows Python's built-in max() for the rest of the session
def max(number1, number2):
    if number1 > number2:
        return number1
    else:
        return number2

max(1, 2)
Out[2]:
In [3]:
max(100, 10)
Out[3]:
Write a function find_longest_word() that takes a list of words and returns the length of the longest one.
In [13]:
def find_longest_word(word_list):
    max_word = 0
    for w in word_list:
        if len(w) > max_word:
            max_word = len(w)
    return max_word

l = ['uydfguyfg', 'ffefu', "kuurhr", "hggug", "hgrhggrel"]
find_longest_word(l)
Out[13]:
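The loop above can also be written with the built-in max() and a generator expression. This is only a sketch: the name longest_word_length is hypothetical, and because the first exercise redefines max(), the built-in is reached through the builtins module.

```python
import builtins

def longest_word_length(word_list):
    # builtins.max is the original built-in, even if max() was redefined earlier;
    # default=0 is returned when the list is empty
    return builtins.max((len(w) for w in word_list), default=0)

print(longest_word_length(['uydfguyfg', 'ffefu', 'kuurhr', 'hggug', 'hgrhggrel']))
```

Unlike a bare max() call on an empty sequence, this version returns 0 for an empty list instead of raising ValueError.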
Write a function filter_long_words() that takes a list of words and an integer n and returns the list of words that are longer than n.
In [7]:
words_list = ['banana', 'apple', 'orange', 'elephant', 'raspberry']
min_length = 7
# with list comprehension: elegant and concise
# "longer than n" means strictly longer, hence > rather than >=
[word for word in words_list if len(word) > min_length]
Out[7]:
In [8]:
# with a function: more explicit and simpler
def filter_long_words(words_list, min_length):
    output = []
    for word in words_list:
        # keep only words strictly longer than min_length
        if len(word) > min_length:
            output.append(word)
    return output
In [9]:
filter_long_words(words_list, 7)
Out[9]:
Define a function generate_n_chars() that takes an integer n and a character c and returns a string, n characters long, consisting only of the chosen character. For example, generate_n_chars(5,"x") should return the string "xxxxx".
In [1]:
def generate_n_chars(n, c):
    return c * n

generate_n_chars(5, 'x')
Out[1]:
Write a program that takes a list of words and returns a dictionary with the words as keys and their lengths as values.
In [2]:
def generate_dictionary(words):
    dictionary = {}
    for word in words:
        dictionary[word] = len(word)
    return dictionary

generate_dictionary(['python', 'blast', 'banana'])
Out[2]:
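The same mapping can be built in one expression with a dictionary comprehension. A minimal sketch; the name word_lengths is hypothetical and not part of the exercise.

```python
def word_lengths(words):
    # map each word to its length; a repeated word simply overwrites its own entry
    return {word: len(word) for word in words}

print(word_lengths(['python', 'blast', 'banana']))
```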
A pangram is a sentence that contains all the letters of the English alphabet at least once, for example: The quick brown fox jumps over the lazy dog. Your task here is to write a function to check a sentence to see if it is a pangram or not.
In [4]:
# you can create a set from a string
set('abcdef')
Out[4]:
In [15]:
a = set('abc')
a.issubset(set('abcd'))
Out[15]:
In [20]:
def check_pangram(sentence):
    letters = set('abcdefghijklmnopqrstuvwxyz')
    found = set()
    for char in sentence.lower():
        found.add(char)
    # check that every English letter is among the characters found;
    # <= is the subset test (a strict < would reject a sentence made of
    # exactly the 26 letters and nothing else)
    return letters <= found
In [21]:
check_pangram('The quick brown fox jumps over the lazy dog')
Out[21]:
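When a sentence is not a pangram, it can help to see which letters are absent. The following sketch uses set difference and string.ascii_lowercase from the standard library; the names missing_letters and is_pangram are hypothetical.

```python
import string

def missing_letters(sentence):
    # letters of the English alphabet that never occur in the sentence
    return set(string.ascii_lowercase) - set(sentence.lower())

def is_pangram(sentence):
    # a pangram leaves no letter missing
    return not missing_letters(sentence)

print(is_pangram('The quick brown fox jumps over the lazy dog'))
print(sorted(missing_letters('The quick brown fox')))
```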
"99 Bottles of Beer" is a traditional song in the United States and Canada. It is popular to sing on long trips, as it has a very repetitive format which is easy to memorize, and can take a long time to sing. The song's simple lyrics are as follows:
99 bottles of beer on the wall, 99 bottles of beer.
Take one down, pass it around, 98 bottles of beer on the wall.
The same verse is repeated, each time with one fewer bottle. The song is completed when the singer or singers reach zero.
Your task here is to write a function capable of generating all the verses of the song.
In [25]:
def sing():
    verse1 = '{0} bottles of beer on the wall, {0} bottles of beer.'
    verse2 = 'Take one down, pass it around, {0} bottles of beer on the wall.'
    bottles = 99
    while bottles > 0:
        print(verse1.format(bottles))
        print(verse2.format(bottles-1))
        bottles -= 1
In [26]:
sing()
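A variant that yields the lines instead of printing them makes the song easier to test or to write to a file. This is a sketch, not the exercise's reference answer: the name verses, the start parameter, and the singular "1 bottle" wording are assumptions added here.

```python
def verses(start=99):
    # yield the two lines of each verse, counting down from start to 1
    def count(n):
        # "1 bottle" is singular; every other count, including 0, is plural
        return '{0} bottle{1}'.format(n, '' if n == 1 else 's')

    for bottles in range(start, 0, -1):
        yield '{0} of beer on the wall, {0} of beer.'.format(count(bottles))
        yield 'Take one down, pass it around, {0} of beer on the wall.'.format(count(bottles - 1))

# a short run with two bottles; verses() with no argument starts at 99
for line in verses(2):
    print(line)
```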
Write a function char_freq() that takes a string and builds a frequency listing of the characters contained in it. Represent the frequency listing as a dictionary. Try it with something like char_freq("abbabcbdbabdbdbabababcbcbab").
In [31]:
def char_freq(word):
    dictionary = {}
    for letter in word:
        if letter not in dictionary:
            dictionary[letter] = 1
        else:
            dictionary[letter] += 1
    return dictionary

char_freq('abbabcbdbabdbdbabababcbcbab')
Out[31]:
In [29]:
def char_freq(word):
    dictionary = {}
    for letter in set(word):
        dictionary[letter] = word.count(letter)
    return dictionary

char_freq('abbabcbdbabdbdbabababcbcbab')
Out[29]:
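The standard library already provides this kind of frequency table through collections.Counter. A minimal sketch; the name char_freq_counter is hypothetical.

```python
from collections import Counter

def char_freq_counter(word):
    # Counter tallies each character; dict() turns the result into a plain dictionary
    return dict(Counter(word))

print(char_freq_counter('abbabcbdbabdbdbabababcbcbab'))
```

Counter makes a single pass over the string, whereas the set-based version above calls word.count() once per distinct character.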
Write a function that will calculate the average word length of a text stored in a file (i.e., the sum of all the lengths of the word tokens in the text, divided by the number of word tokens). Add an option to exclude blank lines or chapter headers from the computation.
Use the aristotle.txt file contained in the data directory.
In [32]:
def average_word_length(input_file, skip=False):
    words = []
    # the with block closes the file once the loop is done
    with open(input_file, 'r') as handle:
        for line in handle:
            # remove the newline character and other trailing whitespace
            line = line.rstrip()
            if skip:
                # after rstrip() a blank line is empty; a line made of a
                # single repeated character is skipped as well
                if len(set(line)) <= 1:
                    continue
            # go word by word; split() never yields empty strings
            for word in line.split():
                words.append(len(word))
    total_words_length = 0
    for word_len in words:
        total_words_length += word_len
    return float(total_words_length) / len(words)
In [33]:
average_word_length('../data/aristotle.txt')
Out[33]:
In [34]:
average_word_length('../data/aristotle.txt', skip=True)
Out[34]:
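The definition (sum of token lengths divided by number of tokens) can be checked by hand on a tiny input. The sketch below assumes Python 3 and uses io.StringIO as a stand-in for the file; the sample text and the name mean_word_length are made up for illustration and are not taken from aristotle.txt.

```python
import io

def mean_word_length(lines):
    # lengths of all whitespace-separated tokens, line by line
    lengths = [len(word) for line in lines for word in line.split()]
    if not lengths:
        # no tokens at all: avoid dividing by zero
        return 0.0
    return sum(lengths) / len(lengths)

# four tokens of lengths 2, 2, 2 and 3; the blank line contributes nothing
sample = io.StringIO('to be\n\nor not\n')
print(mean_word_length(sample))
```

Here the result is (2 + 2 + 2 + 3) / 4 = 2.25. Blank lines hold no tokens, so skipping them cannot change the average.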
An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once; e.g., orchestra = carthorse. Using the word list in the unixdict.txt file (data directory), write a program that finds the largest set of words that share the same characters.
In [39]:
def anagrams(infile):
    words = {}
    with open(infile) as handle:
        for line in handle:
            word = line.rstrip()
            # dictionary is KEY:VALUE
            # key is the frozenset of unique letters: hashable and independent
            # of iteration order, which tuple(set(word)) does not guarantee
            # value is a set of words
            words.setdefault(frozenset(word), set()).add(word)
    # return the longest set of words
    longest = set()
    for key, value in words.items():
        if len(value) > len(longest):
            longest = value
    return longest
In [41]:
def anagrams(infile):
    words = {}
    with open(infile) as handle:
        for line in handle:
            word = line.rstrip()
            # dictionary is KEY:VALUE
            # key is the frozenset of unique letters: hashable and independent
            # of iteration order, which tuple(set(word)) does not guarantee
            # value is a set of words
            words.setdefault(frozenset(word), set()).add(word)
    # return the longest set of words
    return sorted(words.values(), key=len)[-1]
In [42]:
anagrams('../data/unixdict.txt')
Out[42]:
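The solutions above group words by their set of unique letters, which ignores how often each letter occurs. Under the stricter definition quoted in the exercise (every letter used exactly once), a possible sketch keys each word on its letters in sorted order. It works on an in-memory list rather than unixdict.txt; the sample words and the name largest_anagram_groups are hypothetical.

```python
from collections import defaultdict

def largest_anagram_groups(words):
    # group words under their letters in sorted order,
    # e.g. 'orchestra' and 'carthorse' both map to 'acehorrst'
    groups = defaultdict(set)
    for word in words:
        groups[''.join(sorted(word))].add(word)
    if not groups:
        return []
    # keep every group that reaches the maximum size, since ties are possible
    best = max(len(group) for group in groups.values())
    return [group for group in groups.values() if len(group) == best]

sample = ['orchestra', 'carthorse', 'listen', 'silent', 'enlist', 'apple']
print(largest_anagram_groups(sample))
```

Returning a list of groups rather than a single set keeps all winners when several groups share the largest size.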